
Conversation

@willmj willmj commented Apr 29, 2025

Goes with #545 to allow all-linear for ScatterMoE. This PR always filters the router out of the LoRA state dict; this should be reversed when expert tuning support for LoRA layers is added, and has been noted as such in the code. This allows users to pass all-linear as target modules when using ScatterMoE (see the sketch below).

Signed-off-by: Will Johnson <[email protected]>
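A minimal sketch of the filtering described above, assuming router weights can be identified by a `router` substring in their state-dict keys; the helper name, key pattern, and the `LoraConfig` usage are illustrative, not the PR's actual implementation:

```python
from collections import OrderedDict

from peft import LoraConfig


def filter_router_from_lora_state_dict(state_dict):
    """Drop ScatterMoE router weights from a LoRA state dict.

    Hypothetical helper: routers are not LoRA-tuned yet, so their keys
    are filtered out of the saved adapter. Reverse this once expert
    tuning support for LoRA layers is added.
    """
    return OrderedDict(
        (key, value) for key, value in state_dict.items() if "router" not in key
    )


# With the router filtered out, "all-linear" can safely target every
# linear layer, including the ScatterMoE expert projections.
lora_config = LoraConfig(r=8, target_modules="all-linear")
```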
@willmj willmj requested a review from fabianlim as a code owner April 29, 2025 17:52
@willmj willmj requested a review from kmehant April 29, 2025 17:52
willmj added 2 commits April 29, 2025 14:53
```ini
[testenv]
deps =
    pytest>=7
    importlib-metadata
```
@anhuong anhuong commented Apr 30, 2025

Noting this change: this error comes from the lint job failing to import from torch:

.tox/py/lib/python3.9/site-packages/torch/distributed/elastic/rendezvous/__init__.py:142: in <module>
    from .registry import _register_default_handlers, _register_out_of_tree_handlers
.tox/py/lib/python3.9/site-packages/torch/distributed/elastic/rendezvous/registry.py:19: in <module>
    from importlib_metadata import entry_points
E   ModuleNotFoundError: No module named 'importlib_metadata'

The import itself looks like something that was added to PyTorch about 7 months ago: https://github.com/pytorch/pytorch/blame/main/torch/distributed/elastic/rendezvous/registry.py#L18

So I'm not sure why this is coming up as a new error when it didn't fail previously. It should not affect the image build, though, since we build on Python 3.12 and this import only happens on Python < 3.10 (see the sketch below).
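For context, the version gate in torch's registry.py is roughly the following (a paraphrase, not an exact copy of the linked source), which is why the backport package is only needed in the Python 3.9 tox environment and not in the 3.12 image:

```python
import sys

# Python < 3.10 pulls in the third-party importlib_metadata backport;
# newer versions use importlib.metadata from the standard library.
if sys.version_info < (3, 10):
    from importlib_metadata import entry_points
else:
    from importlib.metadata import entry_points
```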

@willmj willmj merged commit 2aeeca4 into foundation-model-stack:main Apr 30, 2025
7 checks passed